max rank | avg. rank | sentence |
---|---|---|
101 | 42.4167 | And he will do it for you from now on as well. |
105 | 56.4000 | If there were, we would know about it by now. |
107 | 50.6364 | You would think that there would be more they can do. |
130 | 57.7000 | And after all, that is all they need to know. |
132 | 31.2000 | I want you to have this one just for you. |
133 | 50.2727 | We think you will find all of that here, and more. |
134 | 52.6250 | You can still do that if you want to but you really have to want to. |
136 | 47.8333 | For them to do it to us, we had to be there. |
137 | 64.5000 | You can get more of their information here. |
138 | 51.4444 | But they all are not the same like other. |
138 | 51.6364 | I think you will see then not all are the same. |
144 | 65.6364 | We don't know if its this one or a different one. |
147 | 55.2000 | I think I will make some of these this year! |
149 | 88.2000 | They want them where they know they can find them. |
155 | 53.4375 | I love that you did this, because it is the same for so many of us! |
156 | 69.9231 | I can go in and find her, get her out before they know." |
157 | 51.0000 | The more I know of him, the more I want to know. |
161 | 45.8750 | He was one of the first to do so, but he would not be the last. |
162 | 75.0000 | This world has many who take what they can get. |
167 | 70.3077 | “There are some people who just get what they want in the world. |
168 | 60.3000 | He did not all all the the does not have. |
171 | 80.0000 | There will always be people who get, and people who don't get. |
182 | 51.3000 | This is the first time I think you are off. |
183 | 105.7500 | But they could always know something I don't. |
183 | 56.8889 | I would have said something to them about it. |
183 | 74.0000 | So now its time to do something about that. |
183 | 91.1250 | " You would have something the world should know. |
184 | 62.2727 | But if I am all of these things, what are you? |
184 | 74.3636 | There are so many great things about it that I love. |
184 | 88.7500 | We all like different things, and different people. |
The maximum word rank of a sentence is, by definition, the rank of its rarest word. If this value is low, every word in the sentence is a high-frequency word. For this reason, the sentences with the smallest maximum word rank are of interest. The table lists these sentences, restricted to those longer than 40 characters.
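The definition can be made concrete with a minimal sketch. It assumes a toy, pre-tokenized corpus and derives ranks directly from word counts (rank 1 = most frequent word). The names `rank_table` and `sentence_rank_stats` are hypothetical and do not come from the corpus database. The sketch also does not reproduce the `w_id>100` filter or the offset of 100 used in the queries on this page.

```python
from collections import Counter

def rank_table(tokenized_sentences):
    """Map each word to its frequency rank (1 = most frequent)."""
    counts = Counter(w for sent in tokenized_sentences for w in sent)
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: r for r, w in enumerate(ordered, start=1)}

def sentence_rank_stats(sentence, ranks):
    """Return (maximum rank, average rank) over the words of one sentence."""
    rs = [ranks[w] for w in sentence]
    return max(rs), sum(rs) / len(rs)

# Toy corpus (an assumption for illustration only).
corpus = [
    ["you", "know", "you", "can"],
    ["you", "can", "know"],
    ["zebras", "can", "know"],
]
ranks = rank_table(corpus)
stats = [sentence_rank_stats(s, ranks) for s in corpus]
```

Sorting `stats` by the first component would give the ordering used for the table: the sentence containing the rare word `zebras` ends up last.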
The overall distribution of the maximum rank across all sentences of the corpus is shown in a diagram with a log-scaled x-axis.
The sentences in the table are of interest because they are usually easy to understand. The distribution may give insight into the corpus and may provide parameters for comparing languages.
While the distribution can be estimated from a small corpus, sentences like those in the table are rare, so a larger corpus gives more impressive results.
Table data:
select max(w_id)-100 as m, avg(w_id)-100 as a, s.sentence from sentences s, inv_w i where s.s_id=i.s_id and length(sentence)>40 and i.w_id>100 group by s.s_id order by m limit 30;
Distribution data:
select m, count(*) from (select 100* round((max(w_id)-100)/100) as m from sentences s, inv_w i where s.s_id=i.s_id and i.w_id>100 group by s.s_id) aa group by m;
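The bucketing done by the distribution query can be sketched as follows. This is only an illustration on made-up values: it assumes that the maximum ranks have already been computed per sentence and that halves round upward, since this page does not state which database is used. The names `bucket` and `distribution` are hypothetical.

```python
from collections import Counter

def bucket(max_rank, width=100):
    """Round a non-negative rank to the nearest multiple of `width` (halves go up)."""
    return (max_rank + width // 2) // width * width

def distribution(max_ranks, width=100):
    """Count sentences per bucket, sorted by bucket value."""
    counts = Counter(bucket(r, width) for r in max_ranks)
    return sorted(counts.items())

# Made-up maximum ranks, one per sentence (an assumption for illustration).
example = [101, 149, 150, 183, 184, 260, 12000]
hist = distribution(example)
```

Integer arithmetic is used instead of Python's built-in `round`, which rounds halves to the nearest even number and would therefore put a value such as 250 into a different bucket.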
Explain the distribution, especially the increase in its right part.
4.5.2.2 Average word rank in sentence
4.5.2.3 Sentences consisting of many low frequency words I
4.5.2.4 Sentences consisting of many low frequency words II
4.5.2.5 Sentences consisting of short words only I
4.5.2.6 Sentences consisting of short words only II
4.5.2.7 Sentences consisting of long words only I
4.5.2.8 Sentences consisting of long words only II